Optional BatchNorm integration in NatureCNN #2132

Mahsarnzh · 2025-05-06T21:21:00Z

Description

Adds optional Batch Normalization to the NatureCNN architecture used by the feature extractor module of Stable-Baselines3. A new use_batch_norm flag toggles a BatchNorm layer after each convolutional layer, giving image-based environments an opt-in way to improve performance and training stability.
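A minimal sketch of what such a flag could look like, written against plain PyTorch rather than the actual diff in this PR. The class name NatureCNNSketch, the 84x84 input size, and the constructor defaults are assumptions made for illustration; only the use_batch_norm flag and the "BatchNorm after each convolutional layer" placement come from the description above.

```python
import torch
from torch import nn


class NatureCNNSketch(nn.Module):
    """Nature-DQN style CNN with optional BatchNorm after each conv layer (illustrative only)."""

    def __init__(self, n_input_channels: int = 4, features_dim: int = 512, use_batch_norm: bool = False):
        super().__init__()
        # (out_channels, kernel_size, stride) for the three conv layers of the Nature DQN stack
        conv_specs = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]
        layers = []
        in_channels = n_input_channels
        for out_channels, kernel_size, stride in conv_specs:
            layers.append(nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride))
            if use_batch_norm:
                # Normalize each channel over the batch and spatial dimensions
                layers.append(nn.BatchNorm2d(out_channels))
            layers.append(nn.ReLU())
            in_channels = out_channels
        layers.append(nn.Flatten())
        self.cnn = nn.Sequential(*layers)

        # Infer the flattened size with a dummy pass (assumes 84x84 observations).
        # eval() keeps the dummy pass from touching the BatchNorm running statistics.
        self.cnn.eval()
        with torch.no_grad():
            n_flatten = self.cnn(torch.zeros(1, n_input_channels, 84, 84)).shape[1]
        self.cnn.train()

        self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.ReLU())

    def forward(self, observations: torch.Tensor) -> torch.Tensor:
        return self.linear(self.cnn(observations))
```

With use_batch_norm=False the layer list is the same as the plain conv/ReLU stack, so the flag only adds modules when it is enabled.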

Motivation and Context

This change is intended to mitigate exploding gradients. With use_batch_norm set to False, the extractor behaves exactly as before; with it set to True, training can converge faster and tolerate higher learning rates.
Beyond that, it lets users opt in to Batch Normalization in NatureCNN, which can improve training stability and convergence, especially in environments with high variance in pixel input. I initially explored alternatives (LayerNorm, GroupNorm); BatchNorm showed the best trade-off between speed, stability, and convergence.
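For readers who want to compare the alternatives mentioned above, here is a small hedged sketch of how the three normalization layers could be swapped in after a convolution. The make_norm helper and its kind strings are hypothetical, the group count of 8 is an arbitrary choice, and layer normalization over conv features is approximated with a single-group GroupNorm; none of this is taken from the PR itself.

```python
import torch
from torch import nn


def make_norm(kind: str, num_channels: int) -> nn.Module:
    """Return a normalization layer for (N, C, H, W) feature maps (hypothetical helper)."""
    if kind == "batch":
        # Statistics per channel, computed across the batch and spatial dims
        return nn.BatchNorm2d(num_channels)
    if kind == "group":
        # Statistics per sample, computed within groups of channels (8 groups assumed)
        return nn.GroupNorm(num_groups=8, num_channels=num_channels)
    if kind == "layer":
        # One group spanning all channels: per-sample statistics over (C, H, W)
        return nn.GroupNorm(num_groups=1, num_channels=num_channels)
    if kind == "none":
        return nn.Identity()
    raise ValueError(f"Unknown normalization kind: {kind}")


features = torch.randn(16, 32, 20, 20)  # fake conv output: batch of 16, 32 channels
for kind in ("batch", "group", "layer", "none"):
    normalized = make_norm(kind, 32)(features)
    print(kind, tuple(normalized.shape))
```

Unlike BatchNorm, the group and layer variants do not depend on batch statistics, so they behave identically in training and evaluation mode.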

I have raised an issue to propose this change (#2131 for new features and bug fixes)

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

When enabled, this stabilizes feature distributions and reduces internal covariate shift. On Pong, it raises average reward by about 2.3 points at 200k timesteps compared with the default extractor, and all existing tests still pass.

Mahsarnzh added 2 commits May 3, 2025 17:17

Add optional BatchNorm to CNN/MLP extractors

9663068

Add optional BatchNorm to CNN/MLP extractors

edb93fc
